A-OPTIMALITY FOR REGRESSION DESIGNS


N.  N.  Chan 


1.  Introduction. 

Consider the linear regression model

$$y = X\beta + e,$$

where $y$ is an $m \times 1$ vector of observations, $X$ is an $m \times n$
matrix to be called the design matrix, $\beta$ is an $n \times 1$ vector of
unknown parameters, and $e$ is an $m \times 1$ vector of random variables
with mean the $m \times 1$ zero vector and known covariance matrix
$\Lambda$. We assume that $m > n$ and denote the eigenvalues of $\Lambda$
in ascending order of magnitude by

$$\lambda_1 \le \lambda_2 \le \cdots \le \lambda_n \le \cdots \le \lambda_m.$$

For later use denote the diagonal matrix with diagonal elements
$\lambda_1, \ldots, \lambda_i$ by $\Lambda_i$, $i = n$ and $m$.

For a given design matrix $X$ of rank $n$, an unbiased estimate of
the parameter $\beta$ based on the observation $y$ is the simple least
squares estimate

$$(X'X)^{-1}X'y,$$

whose covariance matrix is given by

$$(X'X)^{-1}X'\Lambda X(X'X)^{-1}. \tag{1}$$


One of the design problems is to choose $X$ from a given experimental
region such that the trace of the matrix in (1) is minimal. This is a
problem in the A-optimal design of regression experiments and was
considered by Dorogovcev (1971) under the more general setting that the
observations are the realization of stochastic processes. Earlier work
on A-optimal designs was given by Elfving (1952) and Chernoff (1953).

In this paper the experimental region under consideration is taken
to be the set $H$ of all $m \times n$ real matrices of rank $n$ whose $i$th
column has a Euclidean norm not exceeding $c_i$, $i = 1, \ldots, n$, where
the $c_i$ are given positive numbers. In section 2, it is shown that for
any matrix $X$ in $H$ the trace of the matrix in (1) has the lower bound

$$\Big(\sum_{i=1}^{n} c_i^2\Big)^{-1}\Big(\sum_{i=1}^{n} \lambda_i^{1/2}\Big)^{2}.$$

In section 3, a necessary and sufficient condition for the existence of an
$X$ in $H$ attaining the lower bound is derived. For the case in which
all the $c_i$ are equal, a partial result was given in Chan and Wong (1981).
Dorogovcev (1971) obtained the lower bound for the special case $n = 2$
and $c_1 = c_2$.

It is worth noting that in the regression model, if one considers
the best linear unbiased estimate $(X'\Lambda^{-1}X)^{-1}X'\Lambda^{-1}y$
and its covariance matrix $(X'\Lambda^{-1}X)^{-1}$, then minimizing the
trace of the latter over all $X$ in $H$ gives an optimal design problem
with a simple solution, as is given in Rao (1973, p. 236). On the other
hand, if one wishes to minimize the determinant of
$(X'\Lambda^{-1}X)^{-1}$, there is the so-called D-optimal design
problem, of which comprehensive reviews can be found in St. John and
Draper (1975) and Kiefer and Galil (1980).


2.  An  Inequality. 

For the regression model and the set $H$ as given in section 1, we
note that in minimizing the trace of the matrix in (1) with respect to
$X$ in $H$, the matrix $\Lambda$ in (1) can be replaced by the diagonal
matrix $\Lambda_m$ without loss of generality, in view of the existence
of an orthogonal matrix $P$ such that

$$\Lambda = P'\Lambda_m P$$

and the following equality

$$(X'X)^{-1}X'\Lambda X(X'X)^{-1} = (Y'Y)^{-1}Y'\Lambda_m Y(Y'Y)^{-1},$$

where $Y = PX$, which is again in $H$. The following lemma of Fan (1949)
will be required in the proof of our main inequality.

Lemma 1. Let $B$ be a real $m \times n$ matrix whose $n$ columns form
an orthonormal set. Then

$$\mathrm{tr}\, B'\Lambda B \ge \mathrm{tr}\, \Lambda_n,$$

where tr represents the trace operation.
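
As a quick illustration of Lemma 1 (a worked special case added here, not taken from the original text), take $m = 2$ and $n = 1$, assume $\Lambda = \mathrm{diag}(\lambda_1, \lambda_2)$ with $\lambda_1 \le \lambda_2$, and let $B = (\cos\theta, \sin\theta)'$ be an arbitrary unit vector. The angle $\theta$ is introduced only for this example.

```latex
\[
\mathrm{tr}\, B'\Lambda B
  = \lambda_1\cos^2\theta + \lambda_2\sin^2\theta
  = \lambda_1 + (\lambda_2 - \lambda_1)\sin^2\theta
  \;\ge\; \lambda_1 = \mathrm{tr}\,\Lambda_1 .
\]
```

Equality holds exactly when $(\lambda_2 - \lambda_1)\sin^2\theta = 0$, that is, when $B$ lies in the eigenspace of the smallest eigenvalue.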

Theorem 1. For any $X$ in $H$,

$$\mathrm{tr}\{(X'X)^{-1}X'\Lambda X(X'X)^{-1}\} \ge \Big(\sum_{i=1}^{n} c_i^2\Big)^{-1}\Big(\sum_{i=1}^{n} \lambda_i^{1/2}\Big)^{2}.$$


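Proof. By the Cauchy-Schwarz inequality applied to the trace
inner product $\mathrm{tr}\{X'Y\}$ between two real $m \times n$ matrices
$X$ and $Y$, we have

$$\mathrm{tr}\{X'X\} \times \mathrm{tr}\{(X'X)^{-1}X'\Lambda^{1/2}\Lambda^{1/2}X(X'X)^{-1}\} \ge \big[\mathrm{tr}\{X'\Lambda^{1/2}X(X'X)^{-1}\}\big]^{2}. \tag{2}$$

But the trace on the right-hand side is

$$\mathrm{tr}\{(X'X)^{-1/2}X'\Lambda^{1/2}X(X'X)^{-1/2}\}, \tag{3}$$

which is not less than $\sum_{i=1}^{n} \lambda_i^{1/2}$ by Lemma 1 (applied
to $\Lambda^{1/2}$, whose eigenvalues are the $\lambda_i^{1/2}$), on noting
that the $n$ columns of the matrix $X(X'X)^{-1/2}$ are orthonormal. By the
definition of the set $H$,

$$\mathrm{tr}\{X'X\} \le \sum_{i=1}^{n} c_i^2. \tag{4}$$

Hence the main inequality follows.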
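
The inequality of Theorem 1 can be checked numerically. The following is a minimal sketch in plain Python, assuming $n = 2$, $m = 3$, a diagonal covariance matrix (no loss of generality by the remark at the start of this section), and illustrative values for the eigenvalues and the norms $c_i$. The helper names `a_criterion`, `lower_bound` and `random_design` are hypothetical and do not come from the original paper.

```python
import math
import random


def transpose(a):
    """Transpose a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def inv2(a):
    """Invert a 2 x 2 matrix."""
    (p, q), (r, s) = a
    det = p * s - q * r
    return [[s / det, -q / det], [-r / det, p / det]]


def a_criterion(x, lam):
    """Trace of (X'X)^{-1} X' Lambda X (X'X)^{-1} for Lambda = diag(lam), n = 2."""
    xt = transpose(x)
    g_inv = inv2(matmul(xt, x))
    lam_x = [[li * v for v in row] for li, row in zip(lam, x)]  # rows of Lambda X
    cov = matmul(matmul(g_inv, matmul(xt, lam_x)), g_inv)
    return cov[0][0] + cov[1][1]


def lower_bound(c, lam):
    """(sum c_i^2)^{-1} (sum of the n smallest lam_i^{1/2})^2."""
    n = len(c)
    root_sum = sum(math.sqrt(li) for li in sorted(lam)[:n])
    return root_sum ** 2 / sum(ci ** 2 for ci in c)


def random_design(m, c, rng):
    """Random m x len(c) matrix whose i-th column has Euclidean norm c[i]."""
    cols = []
    for ci in c:
        v = [rng.gauss(0.0, 1.0) for _ in range(m)]
        norm = math.sqrt(sum(t * t for t in v))
        cols.append([ci * t / norm for t in v])
    return transpose(cols)


rng = random.Random(0)
lam = [1.0, 4.0, 9.0]   # assumed eigenvalues of Lambda (illustrative)
c = [1.0, 2.0]          # assumed column-norm bounds (illustrative)
bound = lower_bound(c, lam)
values = [a_criterion(random_design(3, c, rng), lam) for _ in range(200)]
print(bound, min(values))
```

Every sampled design should give a trace at least as large as `bound`. The sketch only samples designs at random; it does not search for one that attains the bound.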

3.  A-Optimal  Designs. 

The main result of this work is to obtain a necessary and sufficient
condition on $(\lambda_1, \ldots, \lambda_m)$ and $(c_1, \ldots, c_n)$ for
the existence of a matrix in $H$ such that the lower bound in Theorem 1
is attained. For this we need the following lemmas.

Lemma 2. Let $D$ be an $n \times n$ real diagonal matrix with diagonal
elements $d_1 \le d_2 \le \cdots \le d_n$, and $a_1, \ldots, a_n$ be $n$
real numbers such that $a_1 \le a_2 \le \cdots \le a_n$ and

$$\sum_{i=1}^{n} a_i = \sum_{i=1}^{n} d_i.$$

Then there exists an $n \times n$ orthogonal matrix $P$ such that the $n$
diagonal elements of $P'DP$ are $a_1, \ldots, a_n$ if and only if

$$\sum_{i=1}^{k} a_i \ge \sum_{i=1}^{k} d_i, \qquad k = 1, 2, \ldots, n-1.$$

This lemma is a version of a result by Horn (1954) and a proof is
given by Mirsky (1958). See also Marshall and Olkin (1979, p. 220).
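
The condition of Lemma 2 is a comparison of partial sums and is easy to test. The sketch below, in plain Python, checks it for given lists and, for the case $n = 2$ only, builds a rotation $P$ with the prescribed diagonal. The function names are hypothetical, and the explicit rotation is an elementary construction added for illustration; it is not the general proof of Horn or Mirsky.

```python
import math


def horn_condition(d, a, tol=1e-12):
    """Check the partial-sum condition of Lemma 2 for eigenvalues d and diagonal a."""
    d, a = sorted(d), sorted(a)
    if abs(sum(d) - sum(a)) > tol:      # the totals must agree
        return False
    sum_d = sum_a = 0.0
    for k in range(len(d) - 1):         # k = 1, ..., n-1 in the notation of the lemma
        sum_d += d[k]
        sum_a += a[k]
        if sum_a < sum_d - tol:
            return False
    return True


def rotation_with_diagonal(d, a):
    """For n = 2, return an orthogonal P such that diag(P' D P) = (a[0], a[1])."""
    if not horn_condition(d, a):
        raise ValueError("the condition of Lemma 2 fails")
    d1, d2 = d
    if d1 == d2:
        return [[1.0, 0.0], [0.0, 1.0]]
    # sin^2(theta) solves d1 cos^2 + d2 sin^2 = a[0]; clamp against rounding.
    s2 = min(1.0, max(0.0, (a[0] - d1) / (d2 - d1)))
    s, co = math.sqrt(s2), math.sqrt(1.0 - s2)
    return [[co, -s], [s, co]]


def diag_of_conjugate(p, d):
    """Diagonal of P' D P for a 2 x 2 matrix P and D = diag(d)."""
    return [sum(d[k] * p[k][j] ** 2 for k in range(2)) for j in range(2)]


p = rotation_with_diagonal([1.0, 3.0], [1.5, 2.5])
print(diag_of_conjugate(p, [1.0, 3.0]))
```

For $n > 2$ the same condition applies, but constructing $P$ needs a more elaborate argument such as the one cited above.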


Lemma 3. Let $D$ be as in Lemma 2 and $B$ be an $n \times k$ matrix whose
$k$ columns form an orthonormal set. Arrange the eigenvalues of the
$k \times k$ matrix $B'DB$ in ascending order
$b_1 \le b_2 \le \cdots \le b_k$. Then $b_i \ge d_i$, $i = 1, \ldots, k$.

This is the Poincaré separation theorem and can be found for example
in Rao (1973, p. 64).


Theorem 2. Suppose that the positive numbers $c_i$, $i = 1, \ldots, n$,
are arranged in ascending order of magnitude and that the smallest
eigenvalue $\lambda_1$ of the covariance matrix $\Lambda$ is positive. Then
there is an $X$ in $H$ such that

$$\mathrm{tr}\{(X'X)^{-1}X'\Lambda X(X'X)^{-1}\} = \Big(\sum_{i=1}^{n} c_i^2\Big)^{-1}\Big(\sum_{i=1}^{n} \lambda_i^{1/2}\Big)^{2}$$

if and only if

$$\Big(\sum_{i=1}^{n} c_i^2\Big)^{-1}\sum_{i=1}^{k} c_i^2 \ge \Big(\sum_{i=1}^{n} \lambda_i^{1/2}\Big)^{-1}\sum_{i=1}^{k} \lambda_i^{1/2}, \qquad k = 1, \ldots, n-1.$$

Proof. Sufficiency: Consider the diagonal matrix $\Lambda_n^{1/2}$ whose
diagonal elements are $\lambda_i^{1/2}$, $i = 1, \ldots, n$. By Lemma 2
there exists an orthogonal matrix $P$ of order $n$ such that the $i$th
diagonal element of $P'\Lambda_n^{1/2}P$ is $bc_i^2$, $i = 1, \ldots, n$,
where

$$b = \Big(\sum_{i=1}^{n} c_i^2\Big)^{-1}\Big(\sum_{i=1}^{n} \lambda_i^{1/2}\Big).$$

Denote by $X$ the $m \times n$ matrix

$$X = b^{-1/2}\begin{pmatrix} \Lambda_n^{1/4}P \\ 0 \end{pmatrix},$$

where $0$ is an $(m - n) \times n$ submatrix of zeros. Note that $X$ is of
rank $n$ as $\lambda_1 > 0$ and that the $i$th diagonal element of $X'X$
equals $c_i^2$ as we have

$$X'X = b^{-1}P'\Lambda_n^{1/2}P.$$


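Hence $X$ is a member of the set $H$. Moreover, for the diagonal matrix
$\Lambda_m$ of order $m$, we have

$$X'\Lambda_m X = b^{-1}\,[\,P'\Lambda_n^{1/4} \;\; 0'\,]\,\Lambda_m \begin{pmatrix} \Lambda_n^{1/4}P \\ 0 \end{pmatrix} = b^{-1}P'\Lambda_n^{1/4}\Lambda_n\Lambda_n^{1/4}P = b^{-1}P'\Lambda_n^{3/2}P,$$

and so

$$\mathrm{tr}\{(X'X)^{-1}X'\Lambda_m X(X'X)^{-1}\} = \mathrm{tr}\{b(P'\Lambda_n^{-1/2}P)P'\Lambda_n^{3/2}P(P'\Lambda_n^{-1/2}P)\} = b\,\mathrm{tr}\{P'\Lambda_n^{1/2}P\} = \Big(\sum_{i=1}^{n} c_i^2\Big)^{-1}\Big(\sum_{i=1}^{n} \lambda_i^{1/2}\Big)^{2}.$$

The proof for sufficiency is completed by replacing $\Lambda_m$ by
$\Lambda$ as remarked at the beginning of Section 2.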
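
The construction in the sufficiency proof can be carried out explicitly when $n = 2$, since the orthogonal matrix $P$ of Lemma 2 can then be taken to be a rotation. The following plain-Python sketch assumes $n = 2$, a diagonal $\Lambda$ with ascending positive eigenvalues, and illustrative values of $\lambda_i$ and $c_i$. The names `optimal_design` and `a_criterion` are hypothetical, and the closed-form rotation is an illustration rather than the paper's general argument.

```python
import math


def transpose(a):
    """Transpose a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def inv2(a):
    """Invert a 2 x 2 matrix."""
    (p, q), (r, s) = a
    det = p * s - q * r
    return [[s / det, -q / det], [-r / det, p / det]]


def a_criterion(x, lam):
    """Trace of (X'X)^{-1} X' Lambda X (X'X)^{-1} for Lambda = diag(lam), n = 2."""
    xt = transpose(x)
    g_inv = inv2(matmul(xt, x))
    lam_x = [[li * v for v in row] for li, row in zip(lam, x)]  # rows of Lambda X
    cov = matmul(matmul(g_inv, matmul(xt, lam_x)), g_inv)
    return cov[0][0] + cov[1][1]


def optimal_design(lam, c, tol=1e-12):
    """Build X = b^{-1/2} [Lambda_n^{1/4} P ; 0] for n = 2 and Lambda = diag(lam).

    lam: ascending positive eigenvalues (length m >= 2); c: ascending norms (length 2).
    """
    r1, r2 = math.sqrt(lam[0]), math.sqrt(lam[1])   # diagonal of Lambda_n^{1/2}
    b = (r1 + r2) / (c[0] ** 2 + c[1] ** 2)
    if b * c[0] ** 2 < r1 - tol:                    # Lemma 2 condition with k = 1
        raise ValueError("the lower bound is not attainable for these lam and c")
    # Rotation P with diag(P' Lambda_n^{1/2} P) = (b c_1^2, b c_2^2).
    s2 = 0.0 if r1 == r2 else min(1.0, max(0.0, (b * c[0] ** 2 - r1) / (r2 - r1)))
    s, co = math.sqrt(s2), math.sqrt(1.0 - s2)
    p = [[co, -s], [s, co]]
    scale = 1.0 / math.sqrt(b)
    # Row k of Lambda_n^{1/4} P is lam_k^{1/4} times row k of P.
    top = [[scale * math.sqrt(r) * v for v in row] for r, row in zip((r1, r2), p)]
    return top + [[0.0, 0.0] for _ in range(len(lam) - 2)]


lam = [1.0, 4.0, 9.0]   # assumed eigenvalues (illustrative)
c = [1.0, 1.2]          # assumed column-norm bounds (illustrative)
x = optimal_design(lam, c)
bound = (math.sqrt(lam[0]) + math.sqrt(lam[1])) ** 2 / (c[0] ** 2 + c[1] ** 2)
col_norms = [math.sqrt(sum(row[j] ** 2 for row in x)) for j in range(2)]
print(col_norms, a_criterion(x, lam), bound)
```

With these values the column norms of `x` match `c` and the criterion agrees with `bound` up to rounding. With `c = [1.0, 2.0]` the function raises `ValueError`, because $bc_1^2 < \lambda_1^{1/2}$ and the condition of Lemma 2 fails.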

Necessity. Suppose that $X$, a member of $H$, is such that the
inequality in Theorem 1 becomes an equality. Then the three inequalities
in the proof of Theorem 1 reduce to equalities. First, note that the $i$th
diagonal element of the matrix $X'X$ equals $c_i^2$, $i = 1, \ldots, n$,
because it cannot exceed $c_i^2$ (as $X$ is in $H$) and from (4)

$$\mathrm{tr}\{X'X\} = \sum_{i=1}^{n} c_i^2.$$

By Lemma 2, it is then enough to show that
$\lambda_1^{1/2}, \ldots, \lambda_n^{1/2}$ are the eigenvalues of the
$n \times n$ matrix $bX'X$. For this, note that the Cauchy-Schwarz
inequality (2) becoming an equality implies that there is a nonzero real
number $d$ such that

$$X = d\Lambda^{1/2}X(X'X)^{-1}.$$

So we have

$$X'X = dX'\Lambda^{1/2}X(X'X)^{-1}.$$
The equality corresponding to (3) then implies that

$$\sum_{i=1}^{n} \lambda_i^{1/2} = \mathrm{tr}\{X'\Lambda^{1/2}X(X'X)^{-1}\} = d^{-1}\,\mathrm{tr}\{X'X\}. \tag{5}$$

Therefore, $d = b^{-1}$, and so

$$bX'X = X'\Lambda^{1/2}X(X'X)^{-1}.$$

It remains to show that the $n \times n$ matrix

$$(X'X)^{-1/2}X'\Lambda^{1/2}X(X'X)^{-1/2}, \tag{6}$$

which is similar to $X'\Lambda^{1/2}X(X'X)^{-1}$, has
$\lambda_1^{1/2}, \ldots, \lambda_n^{1/2}$ as its eigenvalues. In fact, by
replacing $\Lambda$ by $\Lambda_m$ and using Lemma 3, we see that the $i$th
smallest eigenvalue of the matrix in (6) is not less than
$\lambda_i^{1/2}$, $i = 1, \ldots, n$, and, in view of the first equality
in (5), must be equal to $\lambda_i^{1/2}$, completing the proof.